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A summary of recent developments in the field of simulation algorithms for dynamical fermions is given. 
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1. Introduction 

Since Kennedy's complaint 0] that there has 
been little activity and progress in the field 
of fermion simulation algorithms, we have seen 
much work devoted to this algorithmic challenge 
lattice field theory has to face. The most promi- 
nent one is Liischer's suggestion |^ to use a refor- 
mulation of the QCD partition function in terms 
of a number of bosonic field copies. This multi- 
boson technique to simulate dynamical fermions 
has been studied thoroughly in the last years ||- 
pj| and it was established to lead to an exact al- 
gorithm which is competitive to the so far most 
often used Hybrid Monte Carlo (HMC) algorithm 
|T2[ or its variant, the Kramers equation or L2MC 
algorithm ||l3|-|l§. 

On the other hand, with the introduction of 
the multiboson technique, a renewed interest to 
accelerate also the HMC algorithm was stimu- 
lated. Better preconditioner, as discussed by 
Frommer at this conference |l7| , improved inte- 
gration schemes and alternative choices of matrix 
inverters led to a considerable amount of progress. 
On the negative side possible problems with the 
reversibility condit ion of the HMC algorithm has 
been encountered 
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19(-|22|. Finally, simula- 
tions of dynamical fermions with Symanzik im- 
proved actions have already been started. 

In this review talk fermion simulation algo- 
rithms for lattice QCD in the Wilson formulation 
are studied. For lack of space, I have to refer to 
e.g. |l^ for the necessary notations and defini- 
tions. 
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2. The algorithm race 

The question one is presumably most interested 
in, is, which of the partly conceptually very dif- 
ferent but exact algorithms is the fastest in the 
sense that an independent configuration is gener- 
ated within a given amount of computer time. To 
find an answer, we will compare particular imple- 
mentations of the multiboson technique against 
the HMC or the Kramers equation algorithm. 

The last mentioned algorithm is a variant of 
the HMC algorithm. In the continuum the ficti- 
tious time evolution is described by a set of for- 
mal stochastic differential equations in the fields 
(f) and their conjugate momenta tt, 

TT = - T"" + ; = 7^ • (1) 

In eq.(|l|) H denotes the Hamiltonian and a fric- 
tion term with a coefficient 7 appears that can 
be tuned to optimize the algorithm. A Gaussian 
noise t] is fed into the time evolution of the sys- 
tem at each time step. A recent free field analysis 
of generalized molecular dynamics algorithms like 
the Kramers equation has been done in |23| . In 
the case of Wilson fermions the performance of 
the Kramers equation is comparable to the HMC 
algorithm For staggered fermions the situa- 
tion seems to be more favorable for the HMC al- 
gorithm , although no definite conclusion can 
be given at the moment. 

2.1. Scaling of the algorithms 

In order to compare the performance of the al- 
gorithms one would like to understand their scal- 
ing behaviour better. In this review, I will choose 
the lowest eigenvalue Xmin of ( ~ Af^M with 
AI the Wilson-Dirac operator) and the volume 
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V of the lattice as the relevant scaling variables. 
Note that for staggered fermions the correspond- 
ing X„iin would be directly proportional to the 
quark mass squared. The choice of Xmm seems to 
be natural if one has the Schrodinger functional 
(SF) formulation [ p5| in mind, where simulations 
at zero quark masses are possible while Xmin re- 
mains non-zero p^j27[ |. 

number of fields/CG iterations 

The accuracy S of the (Chebyshev) polynomial P 
approximation in the multiboson technique is ex- 
ponential, 6 ^ exp{~2y/en} ||]. Since e has to be 
chosen according to the value of Xmin , in order to 
have a fixed accuracy the number of boson fields 

n l/VXmin- 

The number of fields as a function of Xmin 
can be compared to the number of iterations 
needed in the molecular dynamics algorithms to 
invert the matrix Q^. Fig. 1 shows the time per 
call of the matrix inversion routine in -for this 
discussion- irrelevant units as a function of the 
condition number k = Xmax/Xmin, with X^ax the 
largest eigenvalue of Q^. 
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Figure 1. Time spent in the matrix inversion rou- 
tine for the CG (full circles) and the BiCGstab 
(open circles). 

For the inversion the CG and the BiCGstab 
1^ methods have been used. The double loga- 
rithmic plot shows that both methods are com- 
patible with a linear behaviour and that they ba- 



sically require the same amount of computer time 
for the matrix inversion. Note that the condition 
number is ranging over several orders of magni- 
tude. The slope of the straight line in fig. 1 is 
roughly 0.43 and therefore somewhat better than 
the expected behaviour pOt . 

An important remark is that for a smaller value 
of /? = 5.4 we find, in accordance with p^ , 
that the BiCGstab shows a gain of about 15-20% 
in spite of the fact that Schrodinger functional 
boundary conditions have been used. The obser- 
vation that at /? = 5.6 a comparable performance 
is seen, indicates that the competetivity of the 
two methods depends on the parameters chosen, 
autocorrelation times 

Little to nothing is known about the scaling of 
the autocorrelation time with respect to Xmin- I 
take therefore the freedom of having an optimistic 
point of view and adopt the free field results for 
all the algorithms. This amounts to have an au- 
tocorrelation time T ^ I /\/ Xmin- For the multi- 
boson technique one expects this behaviour for an 
optimized Hybrid-over-relaxation algorithm. For 
the HMC algorithm the same behaviour is found 
in the limit of long trajectories ||3^]. The Kramers 
equation algorithm assumes this with an optimal 
choice of 7 for a single step trajectory p3[ |. 

additional scaling factors 

Both type of algorithms show, unfortunately, an 
additional dependence on the lowest eigenvalue. 
For the multiboson technique it is by now well 
established that the autocorrelation time has a 
linear dependence on the number of boson fields 
and therefore r ~ n ~ l/\/A-mm |@-§,§- 

In the molecular dynamics kind of algorithms it 
is the tuning of the step size St to keep a constant 
acceptance that gives rise to an additional scaling 
factor. One finds, assuming an integrator with 
St^ error that 6t - l/X^J^f„ for the HM C al- 
gorithm and by similar arguments St ^ ^/VXmin 
for the Kramers equation algorithm. Using im- 
proved integration schemes will, of course, soften 
this behaviour. 

volume dependence 

For the multiboson technique one expects a vol- 
ume V behaviour as V{logV)^ §,|. The (logV)^ 
appears in the exact version of the algorithm that 
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is discussed below. To have a constant acceptance 
in this algorithm, the accuracy S ~ l/\/V. There- 
fore the number of fields n ^ logV and since the 
autocorrelation time scales with n we finally find 
the (logV)^ behaviour. 

The volume dependence in the molecular dy- 
namics kind of algorithms comes basically from 
the acceptance behaviour. The free field anal- 
ysis Q gives 

Pace 

erfc{N,ndST^VV), where 
Nmd is the number of molecular dynamics steps. 
From this it follows that the HMC algorithm with 
drNrnd — 1 gives a W^l'^ behaviour whereas 
the Kramers equation algorithm with Nmd — 1 
should give W^^^ . For a fixed acceptance rate 
one finds in practise that the scaling of the step 
size does not contradict the theoretical estimate. 

2.2. Tune up of the algorithms 

There are several improvements that can be 
done to accelerate the "bare" algorithms. The 
most important of these is preconditioning the 
fermion matrix. 

preconditioning 

Using the standard even/odd preconditioning 
one not only finds that the lowest eigenvalue is 
increased by a factor of roughly four but that at 
the same time the largest eigenvalue is lowered by 
a factor of about two. This leads to a factor of 
about 8 improvement for the condition number. 

The consequence for the multiboson technique 
is that the number of fields is decreased whereas 
in the molecular dynamics algorithms the number 
of iterations for the matrix inversion is substan- 
tially reduced. 

update 

For the update parts of both simulation methods 
it turns out that for the multiboson technique one 
has to optimize the mixing ratio of heatbath and 
over-relaxation sweeps QJ^. In the case of the 
molecular dynamics algorithm a better integra- 
tion scheme as suggested by Sexton and Wein- 
garten ]33| gives considerable improvement 

matrix inversion 

At the time of the conference a comparison be- 
tween an exact version of the multiboson tech- 
nique in its hermitian and non-hermitian variant 
is missing. Therefore no definite conclusion about 



which polynomial to choose can be given. 

On the side of the molecular dynamics algo- 
rithms, it seems that the CG and the BiCGstab 
methods show a comparable performance for the 
inversion of whereas the minimal residual is 
doing somewhat worse. Using a higher order inte- 
gration scheme allows to choose larger step sizes. 
It seems therefore to me that for present applica- 
tions there is no real need to use the chronolog- 
ical extrapolation method |Q. If, on the other 
hand, one reaches situations where the step size 
becomes small, this way of finding a good starting 
vector might become relevant. 

It seems that for the inversion of the CG al- 
gorithm is optimal. Therefore the only improve- 
ment on this side may come from better precondi- 
tioning techniques like the Oyanaga ILU precon- 
ditioning in combination with the Eisenstat trick 
lHJ] . An interesting idea is to use this to precondi- 
tion the fermionic force p5| which would allow for 
larger step sizes. It was suggested that one might 
increase the stopping criterion of the inversion 
routine by several orders of magnitude from 
||r|| = 10^*, a standard choice, to ||r|| = lO^-^ 
which still provides a reasonable acceptance rate. 
Of course, by doing this, one has to choose the 
starting vector for the inversion to be always the 
same in order to guarantee the reversibility of the 
algorithm. It is, however, unclear to me, whether 
the drastic change of the residuum by five orders 
of magnitude will not lead to reversibility prob- 
lems in practical simulations. 

2.3. Making the multiboson technique ex- 
act 

There have been several proposals to make the 
multiboson technique exact However, I 

consider the suggestion in p the most promising 
one. One can write the exact partition function 
with the local bosonic action Sb in the form 

Z = J '^U J V<i>^V<^>\det{MP{M))\^e-^^-'^'' .(2) 

The correction factor \det{MP{M))\'^ can be 
written in terms of new Gaussian fields rj, 

J I?,7tp^e-|[A/P(A/)|-„|^ ^ J p^tp^g-Sc (3) 
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which defines the correction action Sc ■ The next 
step is to impose an accept/reject step on the 
correction action Sc- In order not to calculate 
the determinant factor wich appears because we 
are considering probability densities, one has to 
"order" the gauge fields. We say that U succeeds 

[/', U>-U' if SgiU) > Sg{U'). 

The probability P = P^^^' {U, U') is then 
P ' " 



SU(2) 



SU(3) 



(4) 



\det(MP{M))[^'^ 
[ |det(M'P(Af'))P ^ 

This probability density guarantees that in the 
accept/reject step the determinants cancel. In ^ 
the detailed balance proof of the above scheme is 
given. A nice feature of this approach is that one 
can estimate the acceptance rate analytically and 
the data seem to obey this analytical form in a 
convincing manner ||]. 

2.4. Performance comparison 

The two kinds of algorithms have been com- 
pared for two situations. The first one is for QCD 
with gauge group SU(2) [|lO|. Here a hermitian 
polynomial was chosen and no Metropolis test 
was employed. However, the algorithm parame- 
ters were chosen such that the observables agree. 
On the molecular dynamics side, the Kramers 
equation algorithm was used. The tests are per- 
formed at a rather large pion to p-mass ratio of 



0.95. On the other hand, several lat- 



tice sizes 6'^12, 8'^12 and IG"* were taken. The sec- 
ond test has been performed by A. Galli, C. Liu 
and myself with SU(3) as the gauge group. Here 
the multiboson technique had the exactness step 
implemented and its non-hermitian version was 
chosen. The comparison was done on a 8^ lattice 
at /3 = 5.6 and k — 0.1585 which corresponds 
to the critical value It is gratifying to see 

in this case that without any extrapolation the 
numbers of the multiboson simulation come out 
to be completely consistent with the HMC value. 

In fig. 2 the integrated autocorrelation times 
for the plaquette in more machine independent 
units of matrix times vector, Q?", operations is 
shown for the two test cases. Given the large error 
bars, one can say that the algorithms perform 
comparably. It is surprising that the decrease of 
\nim when going from SU(2) to SU(3), does not 
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Figure 2. Performance comparison for the 
multiboson technique and the Kramers equation 
(SU(2)) and the HMC (SU(3)) algorithms. 

seem to increase the autocorrelation time. 



3. More participants in the race 

In this section I will shortly describe alterna- 
tives to the algorithms described above, 
bermions 

Bcrmion simulations p7[ | are done at negative 
flavour numbers Uf. This at first sight strange 
idea appears to be very fruitful in practise. The 
major step forward in this approach is to impose 
non-perturbative matching criteria to relate the 
different negative quark flavours to zero and fl- 
nally to, say, Uf = 2. During the course of in- 
vestigating the bermion method, the authors also 
found a nice way to compute propagators in terms 
of pseudofermions. 

The bermion approach can certainly serve to 
explore the parameter space as they are much 
cheaper than a full QCD simulation. As such 
they can give a very good flrst guess and guide 
the choice of parameters. Of course, at the end 
one would like to perform a real simulation with 
positive nf in order to verify the result. An open 
question in the bermion approach is, whether 
there exists a value of the quark mass at which 
the bermion method breaks down, 
domainwall fermions 
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It was first pointed out in that the idea 
of domainwall fermions, originally invented to 
shed light on the problem of chiral fermions on 
the lattice, could also be used for simulations 
of lattice QCD. At least for an infinite extent 
of the extra dimension, required in the domain- 
wall fermion approach, the quark mass gets only a 
multiplicative renormalization. Tests in the vec- 
tor Schwinger model 1 39 4Q] seem to indicate that 



in practise one might end up with only a moder- 
ate number of slices of the extra dimension to ap- 
proach the chiral limit. It is, however, too early 
to say, whether domainwall fermions can lead to 
a competitive method for full QCD simulations. 
Supersymmetry 

It is interesting to see that supersymmetry with 
Majorana fermions in the adjoint representation 
is now attacked on the lattice. The pioneering 
work by Montvay used the multiboson tech- 
nique to setup a simulation program. There, a 
particular construction of the polynomial in a two 
step procedure to approximate the Pfaffian was 
given. From the point of view of this review talk 
it is intriguing that the results by Montvay have 
been recalculated by using a Hybrid molecu- 
lar dynamics algorithm and a comparable perfor- 
mance of both algorithms have been found. Of 
course, the Hybrid molecular dynamics approach 
has errors ((5t)^ and one would therefore prefer 
an exact version of the multiboson technique. 

There are additional attempts like Slavnov's 
way of bosonization of the fermion action or 
the approach by Liu and Thron to get rid of the 
pseudofermions altogether |4^. But these works 
are at a very early stage and one has to wait in 
order to see their benefits. The attempts to use 
adaptive step size methods seem not to lead to 
further progress A final but important re- 

mark is that with the multiboson technique sim- 
ulations with Uf = 1 might be feasible [Q. 

4. Reversibility 

In order for the HMC algorithm to be exact, 
the equations of motion used therein have to be 
reversible. Although this is certainly the case for 
a computer with infinite precision arithmetic, in 
daily life one has to face rounding error effects. 



Indeed, it was noticed that in practical simula- 
tions the reversibility condition is not satisfied 
fl^ ]. Moreover, in it was pointed out that 
the equations of motion as used in the HMC al- 
gorithm are chaotic in nature and have a positive 
Liapunov exponent v. 

This implies that rounding errors eventually 
become exponentially amplified. The values of 
the Liapunov exponents have been studied as 
a function of the parameters (3 and k p^ , ^ . 
Whereas the /3 dependence is noticeable, the k 
dependence appears to be rather weak. In it 
was suggested that the Liapunov exponent scales 
like the correlation length ^ such that = const. 
One has to see, whether this interesting hypoth- 
esis will withstand future tests. 

At the moment no quantitative estimate of the 
exponential amplification of rounding errors can 
be given as far as physical observables are con- 
cerned. However, estimates [^,|2| indicate that 
one might face problems on lattice sizes larger 
than 32'' with 32 bit arithmetic. There the re- 
versibility error might reach a level of several per- 
cent. 

5. Symanzik improvement 

During the conference, the use of improved ac- 
tions has certainly been a major topic. The in- 
vestigations of the effects of improvement have so 
far been restricted to the quenched approxima- 
tion. It is a natural and necessary next step, to 
implement improved actions also for dynamical 
fermions. 

Here I want to concentrate only on the 
Symanzik on-shell improvement program 
Following this program leads to the introduction 
of a Sheikholeslami-Wohlert (SW) term as sug- 
gested in Esl : 



Ssw = ^Csw ^ il){x)a^„T^_,u{x)'4){x) 



(5) 



For the implementation of the SW-term in 
the molecular dynamics algorithms, HMC and 
Kramers equation, one needs to find the equations 
of motion while preserving even/odd precondi- 
tioning. These equations can straightforwardly 



be derived |49-5l|. A complete implementation 
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of the SW-term in a dynamical fermion simula- 
tion program has been done in [ pO[ . There also 
possible tests of the code, using strong coupling 
expansions for the plaquette and are given. 
An independent complete implementation for the 
case of finite temperature simulations has been 
done at SCRI ||. 




Figure 3. Ratio of the time needed for the SW- 
action as compared to standard Wilson fermions. 

Of course, the crucial question is, whether the 
overhead of the SW-term as compared to stan- 
dard Wilson fermions appears to be small enough 
to justify its use in practical simulations. In fig. 3 
the time overhead of the SW-action is plotted. 
The true time for a particular implementation 
should lie between the two limiting curves [50| . 
In any case, one notices that for a situation where 
the number of iterations in the inversion routine 
NcG exceeds roughly 250, the overhead is only 
about 20%. One should remark that it is still an 
open question, how the SW-term can be imple- 
mented in a simulation program using the multi- 
boson technique if one wants to keep heatbath or 
over-relaxation methods for the updates. 

6. Conclusion 

Contrary to the situation 3 years ago, we now 
have two exact methods to simulate dynamical 
fermions at our disposal which perform com- 
petitively. The theoretical cost of the multibo- 
son technique expressed in terms of \min and 
V is y ^ogVY I X'^l^^^. This can be compared to 



the cost of the molecular dynamics algorithms 
where one finds yV^'^l>^itu for the HMC and 
VV^I^ jX^J^^ for the Kramers equation algorithm. 
Of course, one should keep in mind that the multi- 
boson technique has a large memory requirement 
and might therefore not be suitable for particular 
machines. 

The real world situation of comparing the two 
kind of algorithms is summarized in fig. 2. It 
demonstrates that for the points investigated so 
far no clear priority for one or the other algorithm 
can be given. It would certainly be important to 
have more data points. I also consider it to be 
urgent to test the theoretical scaling behaviour 
as much as possible in order to see, whether the 
algorithms do what we think they should do. 

We should try to find out what is the magni- 
tude of reversibility violations in the HMC algo- 
rithm and attempt to quantify its implication on 
physical observables. If lack of reversibility ap- 
pears to be a problem with the HMC method, 
one should use the multiboson technique or the 
Kramers equation algorithm. 

Dynamical fermion simulations with Symanzik 
improved actions have already been started. 
Fig. 3 demonstrates that this can be done with 
only a small overhead for simulations that need 
more than about 250 iterations for the matrix in- 
version. If also for dynamical fermions improve- 
ment turns out to be important then this will lead 
to substantial progress in the area of fermion sim- 
ulation algorithms. 
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